Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter

نویسنده

  • Zeyuan Allen-Zhu
چکیده

Given a nonconvex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depend on the smallest (negative) eigenvalue −σ of the Hessian. This parameter σ captures how strongly nonconvex f(x) is, and is analogous to the strong convexity parameter for convex optimization. At least in theory, our methods outperform known (offline) methods for a range of parameter σ, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ0 so that the currently fastest methods for σ > σ0 and for σ < σ0 have different behaviors: the former scales with n 2/3 and the latter scales with n.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

Asynchronous algorithms have attracted much attention recently due to the crucial demands on solving large-scale optimization problems. However, the accelerated versions of asynchronous algorithms are rarely studied. In this paper, we propose the “momentum compensation” technique to accelerate asynchronous algorithms for convex problems. Specifically, we first accelerate the plain Asynchronous ...

متن کامل

Stochastic Conjugate Gradient Algorithm with Variance Reduction

Conjugate gradient methods are a class of important methods for solving linear equations and nonlinear optimization. In our work, we propose a new stochastic conjugate gradient algorithm with variance reduction (CGVR) and prove its linear convergence with the Fletcher and Revves method for strongly convex and smooth functions. We experimentally demonstrate that the CGVR algorithm converges fast...

متن کامل

Variance Reduction for Faster Non-Convex Optimization

We consider the fundamental problem in non-convex optimization of efficiently reaching a stationary point. In contrast to the convex case, in the long history of this basic problem, the only known theoretical results on first-order non-convex optimization remain to be full gradient descent that converges in O(1/ε) iterations for smooth objectives, and stochastic gradient descent that converges ...

متن کامل

Variance-Reduced Proximal Stochastic Gradient Descent for Non-convex Composite optimization

Here we study non-convex composite optimization: first, a finite-sum of smooth but non-convex functions, and second, a general function that admits a simple proximal mapping. Most research on stochastic methods for composite optimization assumes convexity or strong convexity of each function. In this paper, we extend this problem into the non-convex setting using variance reduction techniques, ...

متن کامل

Stochastic Training of Neural Networks via Successive Convex Approximations

This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, highly dimensional) learning problem with a sequence of (strongly convex) approximati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017